Inventi Impact: Audio, Speech & Music Processing


Inventi:easm/34119/21

Hierarchical Phoneme Classification for Improved Speech Recognition

Research 2021 : April - June

Donghoon Oh, Jeong-Sik Park, Ji-Hwan Kim, Gil-Jin Jang

Speech recognition converts input sound into a sequence of phonemes and then finds the corresponding text using language models. Phoneme classification performance is therefore a critical factor in the successful implementation of a speech recognition system. However, correctly distinguishing phonemes with similar characteristics remains a challenging problem even for state-of-the-art classification methods, and classification errors are hard to recover from in the subsequent language processing steps. This paper proposes a hierarchical phoneme clustering method that applies recognition models better suited to different phonemes. The phonemes of the TIMIT database are analyzed using a confusion matrix from a baseline speech recognition model. Using the automatic phoneme clustering results, a set of phoneme classification models optimized for the generated phoneme groups is constructed and integrated into a hierarchical phoneme classification method. In phoneme classification experiments, the proposed hierarchical phoneme group models improved performance over the baseline by 3%, 2.1%, 6.0%, and 2.2% for fricative, affricate, stop, and nasal sounds, respectively. The average accuracy was 69.5% for the baseline and 71.7% for the proposed hierarchical models, a 2.2% overall improvement.
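As a rough illustration of how phoneme groups might be derived from a baseline confusion matrix, the sketch below merges the phonemes that are most often confused with each other. The abstract does not state which clustering algorithm the authors used, so the average-linkage agglomerative procedure, the function names, and the toy confusion counts here are all assumptions made for illustration, not the paper's method or data.

```python
def confusion_similarity(conf):
    """Turn raw confusion counts into a symmetric similarity matrix.

    conf[i][j] is the number of times phoneme i was classified as j.
    Assumes every row has at least one sample (non-zero row sum).
    """
    n = len(conf)
    rates = [[c / sum(row) for c in row] for row in conf]
    # Average the two directions so that sim[i][j] == sim[j][i].
    return [[(rates[i][j] + rates[j][i]) / 2 for j in range(n)]
            for i in range(n)]


def cluster_phonemes(labels, conf, n_groups):
    """Average-linkage agglomerative clustering (illustrative assumption)."""
    sim = confusion_similarity(conf)
    clusters = [[i] for i in range(len(labels))]
    while len(clusters) > n_groups:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                score = sum(sim[i][j] for i, j in pairs) / len(pairs)
                if best is None or score > best[0]:
                    best = (score, a, b)
        # Merge the two clusters whose members are confused most often.
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(labels[i] for i in c) for c in clusters]


# Hypothetical toy counts: rows are true phonemes, columns are predictions.
labels = ["s", "z", "m", "n"]
conf = [
    [80, 15, 3, 2],
    [20, 70, 5, 5],
    [2, 3, 75, 20],
    [1, 4, 25, 70],
]
groups = cluster_phonemes(labels, conf, n_groups=2)
print(groups)
```

With these toy counts the two fricatives end up in one group and the two nasals in the other; a separate classifier could then be trained for each group, which is the general idea behind a hierarchical scheme.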

How to Cite this Article
Attribution/ CC Compliant Citation: Oh, Donghoon, et al. "Hierarchical Phoneme Classification for Improved Speech Recognition." Applied Sciences 11.1 (2021): 428. https://doi.org/10.3390/app11010428 http://creativecommons.org/licenses/by/4.0/ Some formatting elements, header, footer, logos, dates and pagination were modified while adapting this article.
